Diversity analysis: Richness versus evenness

Abstract
Richness and evenness, two important components of diversity, have been the subject of numerous studies exploring their potential dependence or lack thereof. The results have been contradictory and inconclusive, but tend to indicate only a low (positive or negative) correlation. While such reported studies have been based on particular data sets and species abundance distributions, the present article provides the results of a study using randomly generated abundance distributions, and hence offers more generalizable findings and valid statistical results. The results reveal no statistically significant correlation between richness and evenness based on such a random sample of abundance distributions and on four well-known measures of diversity, including Simpson's indices and the entropy index. Of the two diversity components, evenness is found to have the stronger influence on diversity, but for numbers-equivalent or effective-number formulations, richness tends to be the more influential diversity component. For analyzing the tradeoff between richness and evenness for any given diversity measure and abundance distribution, the richness-evenness curve is introduced as a new tool for diversity analysis.


| INTRODUCTION
Diversity is generally considered to consist of two components: richness and evenness. In biology and ecology, richness typically means the number of different species in a sample or population, while evenness refers to the extent to which the different species are equally represented in the sample (population). Diversity increases as the number of species increases and as their relative proportions become increasingly equal or uniform. As stated by Magurran (2004, p. 9), "Species richness is simply the number of species in the unit of study" while "evenness describes the variability in species abundances." A diversity index or measure is a statistic that incorporates information about both richness and evenness.
For a sample, collection, or unit of study with a relative abundance distribution $P_n = (p_1, p_2, \ldots, p_n)$, where $p_i$ is the proportion (sample probability) of the $i$-th species and the $p_i$ sum to 1 (i.e., $\sum_{i=1}^{n} p_i = 1$), the species richness is simply defined as $n$. The species evenness refers to how evenly (uniformly) the $p_i$ are distributed, but its measurement is not so simple and remains unsettled (see, for example, Kvålseth, 2015; Smith & Wilson, 1996). With diversity being considered a combination of richness and evenness, a wide variety of diversity indices have been proposed over the years (e.g., Daly et al., 2018; Magurran, 2004, ch. 4-5). In spite of such a variety of ways to measure and interpret diversity, it certainly would be informative to explore whether any kind of relationship exists between the richness and evenness components. Such information could provide for a better understanding of diversity as a concept and of its measurement.
In fact, the relationship between richness and evenness, if any, has been the subject of considerable interest, controversy, and numerous studies and publications. Some have emphasized that richness and evenness should be independent components (e.g., Heip, 1974; Peet, 1974; Smith & Wilson, 1996), while Jost (2010) has argued that they cannot possibly be independent. Others have studied the potential richness-evenness relationship and generally found limited interaction or low correlations between evenness and richness (e.g., Blowes et al., 2022; Bock et al., 2007; Buzas & Hayek, 2005; Gosselin, 2006; Liu et al., 2023; Ma, 2005; Stirling & Wilsey, 2001; Yan et al., 2023; Zhang et al., 2012). Soininen et al. (2012) conducted a meta-analysis of various studies and found that "significant correlations of species richness and evenness only existed in 71 out of 229 datasets. Eighty-nine were negative and 140 were positive" (p. 803). In a study by Su (2018), the indication is that any richness-evenness relationship will depend on the form of the relative abundance distribution.
The various reported studies exploring a potential relationship between richness and evenness have been based on data collected from particular ecological systems and sites. Ma (2005), for example, studied plant species in different field quadrats in Finland as the basis for relative abundance distributions. As another example, Su (2018) used sample data for island birds, stream fishes, and zooplankton in specific locations. In all of these studies, not only did the richness and evenness components vary, but the values of the diversity indices also varied. Another complicating factor in understanding the relationship between richness and evenness is that different measures of diversity and of evenness were used in different studies.
Although the results from individual studies of a potential richness-evenness relationship or dependence are restricted to the particular ecological systems and species abundance distributions, some generalization may be possible because of the substantial overall reported data sets from varying ecological environments.
The meta-analysis by Soininen et al. (2012) was one such attempt at a generalization with no consistent result, but rather a mixture of negative, positive, or insignificant correlations between richness and evenness. Such lack of consistent results or association between richness and evenness as diversity components also highlights the important point that the frequently used richness by itself is an incomplete measure of diversity.
One way to use a more general database than those based on particular ecological systems for exploring potential richness-evenness relationships is the use of randomly generated relative abundance distributions $P_n = (p_1, \ldots, p_n)$, where the richness $n$ and each $p_i$ ($i = 1, \ldots, n$) are obtained by random number generation.
Random sample data can then be used, for example, to test whether any statistically significant correlation exists between richness and evenness. This approach is one of the objectives of the present article. Such random sample data will also be used to assess the potential associations between different diversity measures and between different evenness measures.
Another objective of this article is to determine analytically the relative effects of richness versus evenness on the values of specific diversity measures. Four well-known diversity measures will be considered, including Simpson's index and the entropy index.
Besides analyses of potential relationships between richness and evenness based on data for which diversity is also a variable, a further objective of this article is to present a method for considering the richness-evenness relationship for fixed diversity. With richness and evenness being components of diversity, one could consider the tradeoff between those two components for any given value of the diversity. Thus, for any given relative abundance distribution $P_n = (p_1, \ldots, p_n)$ and some diversity index with the value $D(P_n)$, one could consider other distributions $P_m$ with the same diversity value, that is, $D(P_n) = D(P_m)$, but with different richness ($m \ne n$) and different evenness.
Such richness-evenness tradeoff relationships, or graphical curves for fixed diversity, will necessarily depend on the diversity measure being used, as will be exemplified and illustrated in this article. Although intuitively rather simple to comprehend as a general concept, rigorous analysis and description of the relationship between richness and evenness for a given value of a diversity measure will require some mathematical formulations. Those developments will also emphasize the validity of the formulations. Real biological data will be used as numerical examples.

| Definitions
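In the most general terms, consider the case of $n$ mutually exclusive and exhaustive categories with the respective probabilities or proportions $p_1, p_2, \ldots, p_n$, with each $p_i \ge 0$ and $\sum_{i=1}^{n} p_i = 1$. In biology or ecology, $P_n = (p_1, \ldots, p_n)$ becomes the abundance distribution for $n$ different species, with $n$ typically being referred to as the species richness. For a generic diversity measure $D$, the value $E(P_n)$ of the corresponding evenness index $E$ for the distribution $P_n$ can be defined as the following normalized form of $D$ (Kvålseth, 2015):

$$E(P_n) = D^*(P_n) = \frac{D(P_n) - D(P_n^0)}{D(P_n^1) - D(P_n^0)} \tag{1}$$

involving the degenerate and uniform distributions

$$P_n^0 = (1, 0, \ldots, 0), \qquad P_n^1 = (1/n, \ldots, 1/n). \tag{2}$$

The term $D(P_n^1)$ in (1) becomes a function of the richness $n$. In its most general form, the diversity value $D(P_n)$ as a function of $D^*(P_n)$ and $n$ can be expressed from (1) as

$$D(P_n) = \left[D(P_n^1) - D(P_n^0)\right] D^*(P_n) + D(P_n^0). \tag{3}$$

For the apparently most popular diversity indices, (3) reduces to the following expressions:

$$D_S(P_n) = 1 - \sum_{i=1}^{n} p_i^2 = \left(1 - \frac{1}{n}\right) D_S^*(P_n) \tag{4}$$

for Simpson's index (Simpson, 1949),

$$H(P_n) = -\sum_{i=1}^{n} p_i \log p_i = (\log n)\, H^*(P_n) \tag{5}$$

for the entropy due to Shannon (1948),

$$D_{S2}(P_n) = \frac{1}{\sum_{i=1}^{n} p_i^2} = (n - 1)\, D_{S2}^*(P_n) + 1 \tag{6}$$

for the second form of Simpson's index, and, for the exponential form of the entropy in (5), apparently first proposed by Sheldon (1969),

$$H_2(P_n) = \exp\left[H(P_n)\right] = (n - 1)\, H_2^*(P_n) + 1. \tag{7}$$

While the above expressions involve the general distribution $P_n = (p_1, \ldots, p_n)$, another special distribution that will be useful in the subsequent analysis is the lambda distribution introduced by Kvålseth (2011) and defined as follows:

$$P_n^\lambda = \left(1 - \lambda + \frac{\lambda}{n}, \frac{\lambda}{n}, \ldots, \frac{\lambda}{n}\right) \tag{8}$$

where $\lambda \in [0, 1]$ is an evenness parameter. This $P_n^\lambda$ is a so-called mixture distribution, being the weighted mean of the extreme distributions $P_n^0$ and $P_n^1$ in (2), that is,

$$P_n^\lambda = \lambda P_n^1 + (1 - \lambda) P_n^0. \tag{9}$$

For the analysis of some diversity index $D$, the utility of $P_n^\lambda$ comes from the fact that for any $P_n$,

$$D(P_n) = D(P_n^\lambda) \quad \text{for a unique } \lambda, \tag{10}$$

as exemplified next.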
In terms of the notation used in this article, it should be noted that the strictly mathematically correct notation would be to use $D$ to denote a diversity function and $D(P_n)$ to denote its value for the distribution $P_n$, as used above. However, for the sake of simplicity and where there is no chance of ambiguity, $D$ may sometimes be used both as a function and its numerical value. The same comment applies to other summary measures used in the article.
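As an illustration of these definitions, the following minimal Python sketch computes the four diversity indices and their normalized evenness forms for a given abundance distribution. It assumes natural logarithms for the entropy and assumes that the indices take their usual forms ($1 - \sum p_i^2$, $-\sum p_i \log p_i$, $1/\sum p_i^2$, and $\exp H$); the function names are hypothetical and are not taken from the article.

```python
import math

def simpson(p):
    """Simpson's index: 1 - sum of squared proportions."""
    return 1.0 - sum(x * x for x in p)

def entropy(p):
    """Shannon entropy with natural logarithms; 0 * log(0) is taken as 0."""
    return -sum(x * math.log(x) for x in p if x > 0)

def simpson2(p):
    """Second form of Simpson's index: reciprocal of the sum of squares."""
    return 1.0 / sum(x * x for x in p)

def entropy2(p):
    """Exponential form of the entropy."""
    return math.exp(entropy(p))

def evenness(index, p):
    """Normalized index: (D(P) - D(P0)) / (D(P1) - D(P0)), for n >= 2."""
    n = len(p)
    low = index([1.0] + [0.0] * (n - 1))   # degenerate distribution
    high = index([1.0 / n] * n)            # uniform distribution
    return (index(p) - low) / (high - low)

def lambda_distribution(n, lam):
    """Mixture lam * uniform + (1 - lam) * degenerate, as a list of length n."""
    return [1.0 - lam + lam / n] + [lam / n] * (n - 1)

p5 = [0.40, 0.30, 0.15, 0.10, 0.05]
for name, index in [("D_S", simpson), ("H", entropy),
                    ("D_S2", simpson2), ("H_2", entropy2)]:
    print(f"{name}: value={index(p5):.4f}, evenness={evenness(index, p5):.4f}")
```

Passing the index as a function keeps the normalization generic, so any other diversity measure with well-defined values at the degenerate and uniform distributions could be plugged in the same way.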

| Value validity
For any measure of evenness, as with summary measures in general, it is essential that all values of a measure provide true, realistic, or valid representations of the attribute being measured, that is, the evenness characteristic. The conditions for such a value-validity property, first introduced by Kvålseth (2014), have been discussed in detail for evenness indices by Kvålseth (2015).
As a brief outline here, the value-validity condition for the normalized diversity index $D^*$ as a measure of evenness can be derived as follows. Consider first the distribution $P_n^\lambda$ in (8) and its extreme members in (2) as points (vectors) in $n$-dimensional Euclidean space, with $D^*(P_n^0) = 0$ and $D^*(P_n^1) = 1$. Then, in terms of the Euclidean distance function $d$, the evenness parameter $\lambda$ can be expressed in terms of metric distances as follows:

$$\lambda = 1 - \frac{d(P_n^\lambda, P_n^1)}{d(P_n^0, P_n^1)}. \tag{11}$$

That is, $\lambda$ equals the relative extent to which the Euclidean distance between $P_n^\lambda$ and $P_n^1$ is less than its maximum distance. The value-validity condition on the diversity $D$ requires that

$$D^*(P_n^\lambda) = \lambda \tag{12}$$

exactly or as an approximation. For the general distribution $P_n = (p_1, \ldots, p_n)$ and from (10), the condition in (12) becomes

$$D^*(P_n) = 1 - \frac{d(P_n, P_n^1)}{d(P_n^0, P_n^1)} \tag{13}$$

or approximately so, with $P_n$ substituted for $P_n^\lambda$ in (11).
Therefore, according to (12) and (13), $D^*(P_n^\lambda)$ and $D^*(P_n)$ measure the relative proximity of $P_n^\lambda$ and $P_n$ to the complete evenness distribution $P_n^1$ based on Euclidean distances. For example, for the simple distribution $P_2^{1/2} = (0.75, 0.25) = \tfrac{1}{2} P_2^0 + \tfrac{1}{2} P_2^1$, (12) requires that $D^*(0.75, 0.25) = 1/2$, which is clearly a most logical result. Nevertheless, none of the diversity indices in (4)-(7) satisfies this condition (Kvålseth, 2015). However, as discussed below, those indices can be corrected so as to comply.
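The example of the distribution (0.75, 0.25) can be checked numerically. The sketch below, with hypothetical function names, compares the normalized forms of the four indices with the Euclidean-distance-based evenness value; it assumes natural logarithms and the usual closed forms of the normalized indices.

```python
import math

def normalized_indices(p):
    """Normalized (evenness) forms of the four indices for n >= 2."""
    n = len(p)
    sum_sq = sum(x * x for x in p)
    h = -sum(x * math.log(x) for x in p if x > 0)
    return {
        "D_S*": (1.0 - sum_sq) / (1.0 - 1.0 / n),
        "H*": h / math.log(n),
        "D_S2*": (1.0 / sum_sq - 1.0) / (n - 1),
        "H_2*": (math.exp(h) - 1.0) / (n - 1),
    }

def distance_evenness(p):
    """One minus the distance to the uniform point, relative to its maximum."""
    n = len(p)
    uniform = [1.0 / n] * n
    degenerate = [1.0] + [0.0] * (n - 1)
    return 1.0 - math.dist(p, uniform) / math.dist(degenerate, uniform)

p = [0.75, 0.25]
target = distance_evenness(p)
values = normalized_indices(p)
print(f"distance-based evenness: {target:.4f}")
for name, value in values.items():
    print(f"{name}: {value:.4f} (difference from target: {value - target:+.4f})")
```

For this distribution all four normalized indices exceed the distance-based value, which illustrates the kind of discrepancy that the value-validity correction is meant to remove.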

| RANDOM SAMPLE RESULTS
The wide variety of reported biological studies of the potential relationships between richness and evenness have involved various types of species and environments. Consequently, the results from such studies apply to those specific situations and may not be generalizable to other situations. In order to explore some more general data, distributions $P_n = (p_1, \ldots, p_n)$ were generated randomly using the computer algorithm described in Kvålseth (2015). Thus, the richness $n \in [2, 100]$ and the value of each $p_i$ were generated as random numbers within given intervals. The results are summarized in Table 1.
While some of the measures included in Table 1 will be defined in subsequent derivations, one conclusion that can be drawn from the data for the diversity measures in (4)-(7) is that no apparent relationship seems to exist between richness and evenness. Based on the statistical results for the Pearson correlation coefficient ($r$) for Data Sets 1-4 in Table 2, the absolute values of the $t$-statistic are all less than the critical value $t_{28, \alpha/2} = 2.048$ for the significance level $\alpha = 0.05$, so that the null hypothesis of zero (population) correlation cannot be rejected. Similarly, the four $p$-values in Table 2 provide insufficient evidence to conclude that the correlation between richness and evenness for any of the diversity measures in (4)-(7) differs from zero. Of course, zero correlation does not necessarily imply statistical independence, since relationships other than linear ones may exist between evenness and richness.
However, from the data in Table 1, no other potential relationship seems plausible.
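The correlation test described above can be sketched as follows. The random generator used here is only an assumption for illustration (normalized gamma weights with a randomly chosen shape); it is not the algorithm of Kvålseth (2015), and the outcome of the test depends on the generator chosen. The sample size of 30 matches the 28 degrees of freedom and the critical value 2.048 quoted in the text.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, size):
    """t-statistic for H0: zero correlation, with size - 2 degrees of freedom."""
    return r * math.sqrt((size - 2) / (1.0 - r * r))

def random_distribution(rng):
    """ASSUMED generator: richness uniform on 2..100, normalized gamma weights."""
    n = rng.randint(2, 100)
    shape = math.exp(rng.uniform(math.log(0.2), math.log(5.0)))
    weights = [rng.gammavariate(shape, 1.0) for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def simpson_evenness(p):
    """Normalized Simpson index: (1 - sum p_i^2) / (1 - 1/n)."""
    return (1.0 - sum(x * x for x in p)) / (1.0 - 1.0 / len(p))

rng = random.Random(2024)          # fixed seed so the run is repeatable
sample = [random_distribution(rng) for _ in range(30)]
richness = [len(p) for p in sample]
evenness = [simpson_evenness(p) for p in sample]

r = pearson_r(richness, evenness)
t = t_statistic(r, len(sample))
CRITICAL_T = 2.048                 # two-sided 5% critical value for 28 df, from the text
print(f"r = {r:.3f}, t = {t:.3f}, reject H0: {abs(t) > CRITICAL_T}")
```

Replacing `simpson_evenness` with another normalized index, or swapping in a different generator, gives the corresponding test for that setting.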
It is important to point out that even though the values of the different evenness indices are found to be highly correlated as shown in Table 2 (Data Sets 5-8), their individual values can differ greatly as seen from the results in Table 1.For example, when com- Comparative results such as these give reason for concern when using even some of the most popular measures of diversity and evenness.However, such concern is alleviated in the subsequent richness-evenness analysis, where value-validity corrections are being applied to the evenness indices.

| General tradeoff formulation
While the preceding analysis is concerned with potential relationships between richness and evenness, or lack thereof, when based on varying distributions $P_n = (p_1, \ldots, p_n)$, consider now the following question: what are the relative effects of richness and evenness on a given diversity measure for any individual distribution $P_n$? That is, what is the tradeoff between richness and evenness for any given diversity value $D(P_n)$? Or, how can different combinations of richness and evenness produce the same given $D(P_n)$-value? The most obvious answer to this question would seem to lie in the definition in (3). Thus, for any given $D(P_n)$, one could simply replace the $P_n$ on the right side of (3) with any other distribution $P_m = (p_1, \ldots, p_m)$ such that $D(P_m) = D(P_n)$ and then solve the resulting equation for $D^*(P_m)$ as a function of $m$. However, since the evenness indices in (4)-(7) do not meet the value-validity condition in (12) (Kvålseth, 2015), an alternative approach should be considered.
In terms of the lambda distribution $P_n^\lambda$ in (8) and a generic diversity measure $D$, the relationship in (10) can be generalized such that for any given $P_n$, the given value $D(P_n)$ can equal $D(P_m^\lambda)$ for various unique $(m, \lambda)$-pairs. There would necessarily be a certain restriction on $m$ depending upon the value $D(P_n)$. For a chosen $m$, $D(P_m^\lambda)$ becomes a function of $\lambda$ that can be solved for $\lambda$ as the proper evenness value. The resulting $\lambda$ may conveniently be denoted by $D_C^*(P_m^\lambda)$, or simply $D_C^*$, to indicate that the normalized $D^*$ has been corrected to satisfy the value-validity condition in (12). This procedure may be summarized as follows: for any given $P_n$,

$$D(P_n) = D(P_m^\lambda) \implies \lambda = D_C^* = f\left[D(P_n), m\right] \tag{14}$$

where $f$ is a function of $D(P_n)$ and $m$. The formulation in (14) holds for any $\lambda \in [0, 1]$ and all $m$ subject to a restriction $m \ge m_0[D(P_n)]$ that depends on the value $D(P_n)$ and consequently on the form of the diversity measure, as will be exemplified next for the diversity measures in (4)-(7). By varying $m$ and $D_C^*$ in (14), the results can also be represented graphically as richness-evenness curves, or R-E curves, for potentially interesting and useful diversity analysis.

| Simpson's measures
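For Simpson's index in (4), it follows from (14) and $P_m^\lambda$ defined in (9) that

$$D_S(P_n) = D_S(P_m^\lambda) = 1 - \left(1 - \lambda + \frac{\lambda}{m}\right)^2 - (m - 1)\left(\frac{\lambda}{m}\right)^2 = \left(1 - \frac{1}{m}\right)\left[1 - (1 - \lambda)^2\right] \tag{15}$$

which, solved for $\lambda$ as the value-validity corrected evenness $D_{SC}^*$, gives

$$D_{SC}^* = 1 - \sqrt{1 - \frac{m\, D_S(P_n)}{m - 1}}. \tag{16}$$

This relationship holds for all $D_{SC}^* \in [0, 1]$ and $m \ge 1/[1 - D_S(P_n)]$ (for the square root to be defined).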
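The tradeoff relationship for Simpson's index, $D_{SC}^* = 1 - \sqrt{1 - m D_S/(m - 1)}$ with $m \ge 1/(1 - D_S)$, is simple to evaluate. The following Python sketch (function names are hypothetical) computes the smallest admissible richness and the corrected evenness for a fixed diversity value.

```python
import math

def min_richness(d_s):
    """Smallest integer m satisfying m >= 1 / (1 - D_S)."""
    return math.ceil(1.0 / (1.0 - d_s))

def corrected_evenness(d_s, m):
    """Value-validity corrected evenness for Simpson's index at richness m."""
    radicand = 1.0 - m * d_s / (m - 1)
    if radicand < -1e-12:
        raise ValueError("m is below the lower limit 1 / (1 - D_S)")
    return 1.0 - math.sqrt(max(radicand, 0.0))  # clip tiny rounding errors

d_s = 0.72
for m in range(min_richness(d_s), 31, 2):
    print(f"m = {m:2d}, evenness = {corrected_evenness(d_s, m):.3f}")
```

With $D_S = 0.72$, the lower limit is $1/(1 - 0.72) = 3.57$, so the curve starts at $m = 4$, and the point at $m = 10$ has evenness $1 - \sqrt{1 - 10(0.72)/9} \approx 0.55$.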
TABLE 2 Pearson's correlation coefficients ($r$) and the $t$-statistics for the null hypothesis that the population correlation $\rho = 0$ (versus $\rho \ne 0$), as well as the $p$-values, based on the data in Table 1.

McIntosh (1967) where $N$ is the sample size, and $1 - \sqrt{\sum_{i=1}^{n} p_i^2}$ proposed by Junge (1994). As an example of this fact, consider the form $D_{S2}(P_n)$ of Simpson's index in (6), for which

$$D_{S2}(P_n) = D_{S2}(P_m^\lambda) = \frac{1}{\left(1 - \lambda + \lambda/m\right)^2 + (m - 1)\left(\lambda/m\right)^2}$$

with $P_m^\lambda$ defined in (8). Solving this expression for $\lambda$ as the value-validity corrected evenness $\lambda = D_{S2C}^*$, it is readily seen that $D_{S2C}^*$ as a function of $D_{S2}(P_n)$ and $m$ is exactly the same as that of (16) with $D_S(P_n) = 1 - 1/D_{S2}(P_n)$.

Examples of such curves are given in Figure 1.

| Entropy measures
In order to obtain the R-E curve for the commonly used entropy $H(P_n)$ in (5), it becomes immediately clear that the equation $H(P_n) = H(P_m^\lambda)$, for any given abundance distribution $P_n$ and the lambda distribution $P_m^\lambda$ in (8), cannot be readily solved for $\lambda = H_C^*$ as required by (14). One could, of course, use a search procedure to obtain all combinations of $m$ and $\lambda$ such that $H(P_n) = H(P_m^\lambda)$ for any given $H(P_n)$.
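One possible search procedure is bisection on $\lambda$, since the entropy of the lambda distribution increases monotonically from 0 at $\lambda = 0$ to $\log m$ at $\lambda = 1$. The sketch below is an illustration under that observation, with natural logarithms assumed and hypothetical function names; it is not a procedure given in the article.

```python
import math

def lambda_entropy(m, lam):
    """Entropy of the lambda distribution (1 - lam + lam/m, lam/m, ..., lam/m)."""
    first = 1.0 - lam + lam / m
    rest = lam / m
    h = -first * math.log(first)
    if rest > 0:
        h -= (m - 1) * rest * math.log(rest)
    return h

def solve_lambda(h_target, m, tol=1e-12):
    """Bisection for the lam in [0, 1] whose lambda distribution has entropy h_target."""
    if not 0.0 <= h_target <= math.log(m):
        raise ValueError("need 0 <= H <= log(m), that is, m >= exp(H)")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if lambda_entropy(m, mid) < h_target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

h_target = 1.65
m_min = math.ceil(math.exp(h_target))  # smallest admissible integer richness
for m in range(m_min, 31, 4):
    print(f"m = {m:2d}, lambda = {solve_lambda(h_target, m):.4f}")
```

Each $(m, \lambda)$ pair printed has the same entropy value, so the pairs trace one richness-evenness curve for that entropy.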
Alternatively, as a more convenient and practical approach, good approximate results can be obtained from the formulation in (17) by Kvålseth (2014).

As examples of R-E curves for $H(P_n)$ involving real biological data, those used for $D_S(P_n)$ and represented by Figure 1 will also be used for $H(P_n)$, with the respective values $H(P_9) = 0.88$, $H(P_{14}) = 1.65$, and $H(P_{20}) = 2.41$. The three curves are given in Figure 2.

| Comments on the R-E curves
The general richness-evenness (R-E) curve is based on evenness $D_C^*(P_m^\lambda)$, or simply $D_C^*$, as a function of the richness $m$ for some given (fixed) diversity value $D(P_n)$, as expressed in (14). Some general properties of an R-E curve are as follows. First, each point $(m, D_C^*)$ on the curve has the same diversity value $D(P_n)$. In order to produce such a curve, which is somewhat analogous in shape to the indifference curve used in economics (e.g., Varian, 2010), only the diversity value $D(P_n)$ and the form of $D$ need to be known. Second, the R-E curve is convex (i.e., bowed toward the origin). Third, the R-E curve has a negative slope. Fourth, since the function $f$ in (14) is assumed to be a single-valued function, R-E curves cannot intersect. Fifth, curves with increasing distance from the origin represent increasing diversity.
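These properties can be checked numerically for a particular case. The sketch below does so for Simpson's index, assuming the corrected evenness $D_{SC}^* = 1 - \sqrt{1 - m D_S/(m - 1)}$ and using the three diversity values 0.38, 0.69, and 0.88 quoted in the text; the helper names are hypothetical.

```python
import math

def corrected_evenness(d_s, m):
    """Corrected evenness for Simpson's index at richness m."""
    return 1.0 - math.sqrt(max(0.0, 1.0 - m * d_s / (m - 1)))

def curve(d_s, m_max=30):
    """Points (m, evenness) from the smallest admissible integer m up to m_max."""
    m_min = max(2, math.ceil(1.0 / (1.0 - d_s)))
    return [(m, corrected_evenness(d_s, m)) for m in range(m_min, m_max + 1)]

def is_decreasing(values):
    """Negative slope: each value is smaller than the one before."""
    return all(b < a for a, b in zip(values, values[1:]))

def is_convex(values, tol=1e-12):
    """Convexity on integer points: second differences are non-negative."""
    return all(a - 2 * b + c >= -tol
               for a, b, c in zip(values, values[1:], values[2:]))

for d_s in (0.38, 0.69, 0.88):
    evenness = [e for _, e in curve(d_s)]
    print(f"D_S = {d_s}: decreasing = {is_decreasing(evenness)}, "
          f"convex = {is_convex(evenness)}")

# Higher diversity gives higher evenness at the same richness (no crossing).
at_20 = [corrected_evenness(d_s, 20) for d_s in (0.38, 0.69, 0.88)]
print("ordered at m = 20:", at_20 == sorted(at_20))
```

The check covers only integer richness values up to 30 and only Simpson's index, so it illustrates the stated properties rather than proving them in general.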
These properties are all rather evident from the real data curves in Figures 1 and 2. Although the general shapes of the curves are quite similar for Simpson's $D_S$ in (4) and the entropy $H$ in (5), their specific details clearly differ considerably. Even though one of the properties of R-E curves is that they cannot intersect for the same diversity measure, they can for different diversity measures, as is apparent from Figures 1 and 2. Thus, for instance, the middle curves for $D_S(P_{14}) = 0.69$ and $H(P_{14}) = 1.65$ are seen to intersect approximately at the point where $m = 8$ and $D_{SC}^*(P_8^\lambda) = H_C^*(P_8^\lambda) = 0.55$. At this crossover point, the $D_S(P_{14})$ and

FIGURE 2 Richness-evenness curves for the entropy index from (17) for the same three different abundance distributions as those used in Figure 1 (see text).

Magurran (2004, pp. 237-238). Comparing the $D_{SC}^*$ values for the three points with $m = 20$ shows how the differences between the three diversity values (0.38, 0.69, and 0.88) are due to the equal differences between the respective evenness values of 0.23, 0.48, and 0.73, as seen from Figure 1 or computed from (16).
As stated at the beginning of Section 4, the tradeoff between richness and evenness for any given diversity value $D(P_n)$ could also be considered by writing (3) as

$$D(P_m) = \left[D(P_m^1) - D(P_m^0)\right] D^*(P_m) + D(P_m^0) = D(P_n)$$

and then simply determining $D^*(P_m)$ as a function of $m$. The resulting richness-evenness curves would at least resemble in form those of $D_C^*$ versus $m$ as presented above and would be entirely appropriate if $D^*(P_m)$ is assumed to be an acceptable evenness measure. However, since this assumption can indeed be challenged for various diversity measures (Kvålseth, 2015), the value-validity corrected evenness measures $D_C^*(P_m^\lambda)$ are used as a more appropriate representation. Also, while the value-validity requirement is discussed quite concisely in this article, more detailed explanations are given by Kvålseth (2011, 2014, 2015).

| Relative effects of richness and evenness
It is rather evident from the data in Table 1 that evenness generally contributes more toward diversity than does richness or that diversity is more sensitive to changes in evenness than to changes in richness.Since the range of variation of those two characteristics differs greatly, one reasonable way to compare their effects on diversity is to consider relative changes in diversity related to relative changes in evenness and richness by a method analogous to partial elasticity widely used in economics (e.g., Varian, 2010, pp. 274-291).
Thus, in terms of relative changes and partial derivatives, one can define the following measure for $D_S$ in (4):

$$E_{D_S^*} = \frac{\partial D_S}{\partial D_S^*} \cdot \frac{D_S^*}{D_S} \tag{18}$$

which measures the sensitivity of $D_S$ to a (relative) change in $D_S^*$ while keeping richness $n$ fixed. Also, by treating $n$ as a continuous variable for purely mathematical purposes and by similarly defining $E_n$ as in (18), the following result is obtained from $D_S = (1 - 1/n) D_S^*$:

$$E_{D_S^*} = 1, \qquad E_n = \frac{1}{n - 1}, \qquad \frac{E_{D_S^*}}{E_n} = n - 1. \tag{19}$$

That is, for $n > 2$, $D_S$ is more sensitive to changes in evenness than to changes in richness ($n$), especially for large $n$.
Similarly, for the entropy measure in (5) and the equivalent relative change expressions for $H^*$ and $n$ to that of (18), it is determined from $H = (\log n) H^*$ that

$$E_{H^*} = 1, \qquad E_n = \frac{1}{\log n}, \qquad \frac{E_{H^*}}{E_n} = \log n \tag{20}$$

which shows that, for $n > 2$, the entropy index is more sensitive to changes in $H^*$ than to changes in $n$. When comparing (19) and (20), it would seem that such differences between evenness and richness in their effects on diversity are more pronounced in the case of $D_S$ in (4) than of $H$ in (5).
In terms of the same definition as in (18), the following result is obtained for $D_{S2}$ in (6) and $H_2$ in (7), both of which have the form $D = (n - 1) D^* + 1$:

$$E_{D^*} = \frac{(n - 1) D^*}{D}, \qquad E_n = \frac{n D^*}{D}, \qquad \frac{E_{D^*}}{E_n} = 1 - \frac{1}{n}. \tag{21}$$

The inference from (21) is that both diversity measures $D_{S2}$ and $H_2$ tend to be somewhat less sensitive to changes in evenness than to changes in richness, but only marginally so when $n$ is large.
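The elasticity comparison can be checked numerically with finite differences. The sketch below assumes the decompositions $D_S = (1 - 1/n) D_S^*$, $H = (\log n) H^*$, and $D = (n - 1) D^* + 1$ for $D_{S2}$ and $H_2$, treats $n$ as continuous, and uses hypothetical function names; the ratios it prints are derived here from those assumed forms.

```python
import math

def d_s(e, n):
    """Simpson's index as a function of evenness e and richness n."""
    return (1.0 - 1.0 / n) * e

def h(e, n):
    """Entropy as a function of evenness e and richness n (natural log)."""
    return e * math.log(n)

def d_s2(e, n):
    """Effective-number form, shared by D_S2 and H_2: (n - 1) * e + 1."""
    return (n - 1.0) * e + 1.0

def elasticities(diversity, e, n, step=1e-6):
    """Central-difference elasticities of diversity with respect to e and n."""
    d = diversity(e, n)
    slope_e = (diversity(e + step, n) - diversity(e - step, n)) / (2 * step)
    slope_n = (diversity(e, n + step) - diversity(e, n - step)) / (2 * step)
    return slope_e * e / d, slope_n * n / d

e, n = 0.6, 10.0
for name, diversity in [("D_S", d_s), ("H", h), ("D_S2 / H_2", d_s2)]:
    el_e, el_n = elasticities(diversity, e, n)
    print(f"{name}: E_evenness = {el_e:.4f}, E_n = {el_n:.4f}, "
          f"ratio = {el_e / el_n:.4f}")
```

A ratio above 1 means the measure responds more strongly to a relative change in evenness than to the same relative change in richness; a ratio below 1 means the reverse.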

| CONCLUSION
There are three main findings from this analysis. First, when considering the results from randomly generated abundance distributions $P_n = (p_1, \ldots, p_n)$ for the four well-known diversity measures in (4)-(7), the conclusion is that there is no statistically significant correlation between the richness and evenness components of those diversity

ACKNOWLEDGMENTS
None.

FUNDING INFORMATION
None.

CONFLICT OF INTEREST STATEMENT
The author declares no conflict of interest.
As an example, consider the distribution $P_5 = (0.40, 0.30, 0.15, 0.10, 0.05)$, for which $D_S(P_5) = 0.72$, so that from (16),

$$D_{SC}^* = 1 - \sqrt{1 - \frac{0.72\, m}{m - 1}}$$

as the tradeoff relationship between the richness $m$ and evenness $D_{SC}^*$ for the fixed diversity value $D_S(P_5) = 0.72$. The resulting graph of $D_{SC}^*$ as a function of $m$ for $m \ge 1/(1 - 0.72) = 3.57$ (or 4) becomes the R-E (tradeoff) curve for $D_S(P_5) = 0.72$. By choosing, for instance, the richness value $m = 10$, the corresponding evenness value (point) along the R-E curve would be $D_{SC}^* = 1 - \sqrt{1 - 10(0.72)/9} = 0.55$.

Other diversity measures that are strictly increasing functions of $D_S(P_n)$ will necessarily have the same richness-evenness curve as that of $D_S(P_n)$ for the same $P_n$. Such diversity measures include $D_{S2}(P_n)$ in (6), the statistical odds measure $D_{S2}(P_n) - 1$ by Kvålseth (1991), $-\log \sum_{i=1}^{n} p_i^2$ proposed by Pielou (1977),

$$D(P_n) = D(P_m^\lambda) \implies \lambda = D_C^* = f\left[D(P_n), m\right] \tag{14}$$

The curves in Figure 1 are based on real data from Magurran (2004), with $D_S(P_9) = 0.38$ (p. 243, "Unburned forest"), $D_S(P_{14}) = 0.69$ (p. 243, "Burned chaparral"), and $D_S(P_{20}) = 0.88$ (pp. 237-238, "Derrycunnitry oakwood"). Those species abundance distributions cover a wide range of diversity values. These curves cover richness values for $m \le 30$, although the asymptotic values of $D_{SC}^*(P_m^\lambda)$ are seen from (16) to be $1 - \sqrt{1 - D_S(P_n)}$ as $m \to \infty$. The curves are presented as being continuous, but they are obviously most meaningful for integer values of $m$.
have the same richness and evenness components; otherwise they differ for this $P_{14}$-distribution. For $m > 8$, the rate of change of evenness with increasing $m$ is clearly greater for $H(P_{14})$ than for $D_S(P_{14})$, whereas for $m < 8$, this rate of change is more comparable. One of the most striking characteristics common to all of these R-E curves is the rather dramatic negative slopes for the smaller $m$ values, where the evenness values approach unity as $m$ approaches the respective lower limits of $1/[1 - D_S(P_n)]$ and $\exp[H(P_n)]$. That is, for a given or constant diversity value $D(P_n)$ and for any other distribution $P_m = (p_1, \ldots, p_m)$ such that $D(P_m) = D(P_n)$, the evenness is most sensitive to changes in $m$ for small $m$-values.

The fact that the actual richness $m = n$ and evenness $D_C^*(P_n)$ is a single point on an R-E curve representing a given diversity value $D(P_n)$, such as the points identified in Figures 1 and 2, indicates the considerable amount of potentially interesting and useful information available in such a curve. The use of such information depends, of course, on the particular interest of a user or researcher in any given situation. With all points along an R-E curve having the same diversity, perhaps the single most useful aspect lies in the ability to see how the diversity components, richness and evenness, can be traded off and still produce the same diversity as that of the original data set. Far from being independent, richness depends entirely on evenness, or vice versa, for any given R-E curve.

While an individual R-E curve can provide information about the potential richness-evenness tradeoff characteristic for a particular data set or diversity value, different R-E curves can also be used for comparing the diversity values for different data sets or distributions $P_n$ when controlling for either richness or evenness. As an example, consider the three curves in Figure 1 and control for $m$ by looking at the point on each curve with, say, $m = 20$. For the top curve, this point corresponds to the diversity $D_S(P_{20}) = 0.88$ for the real $P_{20}$ from

measures. Second, in spite of such lack of association between richness and evenness across abundance distributions, one can analyze richness and evenness for any given $P_n$ and any diversity index by means of the richness-evenness curve introduced above. Third, when considering relative changes in the values of a diversity measure as the result of a relative change in richness (with evenness kept fixed) versus a relative change in evenness (with richness kept fixed), $D_S$ in (4) and $H$ in (5) are found to be more sensitive to changes in evenness than to richness, with the reverse finding for $D_{S2}$ in (6) and $H_2$ in (7).

The richness-evenness (R-E) curve provides a new tool for analyzing diversity. Such a curve can provide interesting and useful information about the potential tradeoff between the richness and evenness components of a given value of any diversity measure. It also has potential utility when comparing the diversity values for different abundance distributions and when comparing the behavior of different diversity measures.

$$D(P_m) = \left[D(P_m^1) - D(P_m^0)\right] D^*(P_m) + D(P_m^0) = D(P_n)$$

AUTHOR CONTRIBUTIONS
Tarald O. Kvålseth: Conceptualization (equal); data curation (equal); formal analysis (equal); funding acquisition (equal); investigation (equal); methodology (equal); project administration (equal); resources (equal); software (equal); supervision (equal); validation (equal); visualization (equal); writing - original draft (equal); writing - review and editing (equal).
Similarly differing results are seen when comparing the diversity values in Table 1. Although the correlation coefficients in Table 2 (Data Sets 9-12) are quite impressive, different diversity measures can produce substantially different results for the same distributions $P_n = (p_1, \ldots, p_n)$. For example, for $D_{S2}$ in (6) and $H_2$ in (7), which both take on values within the same $[1, n]$-interval, it is found that $\mathrm{RMSE}(D_{S2}, H_2) = 15.18$.
TABLE 1 Sample values from randomly generated distributions $P_n = (p_1, \ldots, p_n)$ of the measures $D_S$ and $D_S^*$ defined in (4), $D_{S2}$ and $D_{S2}^*$ in (6), $H$ and $H^*$ in (5), and $H_2$ and $H_2^*$ in (7).
following question: what are the relative effects of richness and evenness on a given diversity measure for any individual distribution $P_n$? That is, what is the tradeoff between richness and evenness for any given diversity value $D(P_n)$? Or, how can different combinations of richness and evenness produce the same given $D(P_n)$-value?

All combinations of $m$ and $H_C^*$ in (17) have the same entropy value $H(P_n)$, with $m \ge \exp[H(P_n)]$ and $H_C^*$ being the (approximate) value-validity corrected evenness index for $H(P_n)$. As with alternative diversity measures that are strictly increasing functions of $D_S(P_n)$, those that are strictly increasing functions of $H(P_n)$, such as $H_2(P_n)$ in (7), will have the same R-E curve as that of $H(P_n)$ for any given $P_n$.

$D_{SC}^*$ from (16) are $D_{SC}^*(P_m^\lambda) = 0.68$ and 0.55, indicating that individual values of $D_{SC}^*$ and $H_C^*$ can differ considerably.